Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XDM: Relayer and Fraud proof updates #2562

Merged
merged 6 commits into from
Mar 5, 2024
Merged

XDM: Relayer and Fraud proof updates #2562

merged 6 commits into from
Mar 5, 2024

Conversation

vedhavyas
Copy link
Member

@vedhavyas vedhavyas commented Feb 27, 2024

This PR brings following updates:

  • Update relayer to generate XDM proofs coming from both Consensus and Domain chain and gossips them
  • Enable Fraud proof verification for XDM
  • Enable tests on pallet-messenger

Note:
The first commit is a little packed due to very invasive changes to relayer. Initial was two commits but did not make any sense keeping the changes separate and instead merged to one.

With this Cross domain messaging can be enabled for testing but there are few more issues we need to handle

  • Balance reserve for Channel open
  • Permissioned channel opening between chains before we enable permissionless domains
  • Prune old MMR leaves through XDM message expiration

Code contributor checklist:

Copy link
Member

@NingLin-P NingLin-P left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense in general

domains/client/relayer/src/worker.rs Outdated Show resolved Hide resolved

Ok(Proof::Consensus {
consensus_chain_mmr_proof: ConsensusChainMmrLeafProof {
consensus_block_hash: finalized_block.1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field seems never used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unused because the MMR proof is self contained but I wanted to keep the hash of finalized block for light client or other who are tracking MMR roots to be able to verify the proof stateless

domains/client/relayer/src/lib.rs Outdated Show resolved Hide resolved
.into_iter()
.filter(|(number, _)| *number <= relay_block_until)
.collect();
Relayer::fetch_unprocessed_consensus_blocks_until(&client, chain_id, number, hash)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR but I'm thinking about we may not really need to track and process every block since the XDM is not extracted from the block body but instead from the state, and as long as the XDM is not responded it won't be removed from the state.

What we want is some retry in case the XDM is dropped silently in the network during relaying, and we can always retry with the latest finalized consensus block when the notification is come.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XDM will remain in the state when un responded, I'm not sure I clearly follow what you mean. As and when a consensus block is finalized, we relay the XDM from those blocks.

I realized we still need a fallback mechanism for when the message was dropped sliently. What we can do here is to run a simultaneous process that re-submits the XDM. I will handle this in seperate PR and there is an issue created for this already I believe

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XDM will remain in the state when un responded, I'm not sure I clearly follow what you mean

I mean we don't need the aux storage for tracking whether a block is processed or not.

Since the XDM is available in the state, whenever a consensus block import/finalization notification comes, we can get the XDM from the state based on the imported/finalized block, filter already relayed XDM, and relay the XDM. If an XDM is dropped silently during propagation then when the next consensus block import/finalization notification comes we will retry to relay this XDM without needing another worker. And yeah let's handle this in a seperate PR.

Comment on lines +149 to +158
// check if this domain block is already relayed
if Relayer::fetch_domains_blocks_relayed_at(
&domain_client,
domain_id,
domain_block_number,
)
.contains(&domain_block_hash)
{
return Ok(None);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An edge case here, if the domain stops progressing (like not activity) then we only try to relay the XDM in the last confirmed domain block for once then mark the block processed and skip it afterward without retrying.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. We need a fallback mechanism that is not built in yet as I have mentioned in the other comment. Will handle that fallback mechanism when we do a cleanup of the aux storage

// then fetch new messages in the block
// construct proof of each message to be relayed
// submit XDM as unsigned extrinsic.
while let Some(block) = chain_block_import.next().await {
while let Some(block) = chain_block_finalization.next().await {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC there only be one consensus block finalization notification per segment, so the finalized consensus block will not come like #1,#2,.. but instead #1000,#2000,.. cc @nazar-pc, perhaps use the block import notification with something like block_to_process = imported_block - K

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fine. Relayer does not expect the block come iincremental. We just need the latest finalized block and we process all the blocks that in between last processed to latest finalized.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may cause problem, if there is one new segment per 1 hour then all relayers from any domain will idle for 1 hour then start relaying XDM at the same time, which brings additional delay and network congestion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is correct, XDM is already quiet delayed due to block confirmation and we expliclty want to wait until consensus block is finalized.

notification with something like block_to_process = imported_block - K

This approach defeats the purpose of waiting for consensus block finalization but rather relying on Kdepth. I'm not sure why we moved away from it but I would like to still relay on finalization instead of block_import

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree using block finalization is better in term of readability, but the consensus block finalization is coupled with archiving and differs from what is provided by substrate by default. Besides the above concern about additional delay and network congestion, also consider if an XDM is dropped silently it needs to wait for another new segment before the next retry.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also consider if an XDM is dropped silently it needs to wait for another new segment before the next retry.

I dont see any concern here. There will always be a retry of the XDM in the next turn and XDM will end being included in the chain. This delay is what we opted for as a base XDM using confirmation and finalization.
We expect external services to be built on top to speed up these transfers at higher incentive and is not a concern for the Subspace.

BUt feel free to propose any potential alternatives on making this approach faster and I would like to understand what can be done here better

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will always be a retry of the XDM in the next turn and XDM will end being included in the chain.

The difference is one retry per block versus one retry per segment.

This delay is what we opted for as a base XDM using confirmation and finalization.
BUt feel free to propose any potential alternatives on making this approach faster and I would like to understand what can be done here better

IMO, the hard required delay is: domain block confirmation (challenge period) + consensus block confirmation (K block depth) while the consensus block finalization is: K blocks depth + blocks of one segment. Simply using the consensus block import notification + K block depth can make it faster, non-blocker.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the hard required delay is: domain block confirmation (challenge period) + consensus block confirmation (K block depth)

I disagree. The hard requirement for XDM v2 is safety over fastness. This means, finalization rather than using K-depth.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, K blocks depth is when a consensus block is confirmed and inrevertable which is what we need for XDM (and also used in many other places), while finalization is coupled with the archiving process to ensure the block is not pruned before archiving, finalization take much longer than K blocks but doesn't mean confirmed and non-finalized block is revertable. cc @nazar-pc in case my understanding is wrong.

Non-blocker for using finalization block as the additional delay is not a concern, plz resolve the conflict though.

@vedhavyas vedhavyas requested a review from NingLin-P March 1, 2024 05:26
@vedhavyas vedhavyas added this pull request to the merge queue Mar 5, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Mar 5, 2024
@vedhavyas vedhavyas added this pull request to the merge queue Mar 5, 2024
Merged via the queue into main with commit 0d22229 Mar 5, 2024
11 checks passed
@vedhavyas vedhavyas deleted the xdm_mmr_2 branch March 5, 2024 17:28
@@ -1,21 +1,22 @@
#![warn(rust_2018_idioms)]
#![deny(unused_crate_dependencies)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to remove this line. It results in following confusing error:

error: external crate `alloc` unused in `domain_client_message_relayer`: remove the dependency or add `use alloc as _;`
  |
note: the lint level is defined here
 --> domains/client/relayer/src/lib.rs:2:9
  |
2 | #![deny(unused_crate_dependencies)]
  |         ^^^^^^^^^^^^^^^^^^^^^^^^^

error: external crate `compiler_builtins` unused in `domain_client_message_relayer`: remove the dependency or add `use compiler_builtins as _;`

error: external crate `panic_unwind` unused in `domain_client_message_relayer`: remove the dependency or add `use panic_unwind as _;`

error: external crate `proc_macro` unused in `domain_client_message_relayer`: remove the dependency or add `use proc_macro as _;`

error: external crate `test` unused in `domain_client_message_relayer`: remove the dependency or add `use test as _;`

error: could not compile `domain-client-message-relayer` (lib test) due to 3 previous errors
warning: build failed, waiting for other jobs to finish...
error: could not compile `domain-client-message-relayer` (lib) due to 5 previous errors

To reproduce -Z build-std is needed, for example:

cd domains/client/relayer
cargo clippy --all-targets -Z build-std --target x86_64-unknown-linux-gnu

I'm not sure where it comes from because there is no such things in the crate and I don't think any of the very few macros generate it either (I checked macro expansion). Might be compiler issue: rust-lang/rust#122105

@vedhavyas vedhavyas added the need to audit This change needs to be audited label Apr 5, 2024
@vanhauser-thc
Copy link
Collaborator

relevant: #2660
LGTM otherwise however due to the overall high complexity we will do a deep dive into XDM once the feature is stable.

@vanhauser-thc vanhauser-thc added audited This change was audited and removed need to audit This change needs to be audited labels Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
audited This change was audited
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants